Bayesian Optimal Design of Validation Experiments
Abstract
An important concern in the design of validation experiments is how to incorporate the mathematical model into the design, so as to allow conclusive comparisons of model predictions with experimental output during model assessment. In this paper, an integrated Bayesian cross-entropy methodology is proposed for the optimal design of validation experiments incorporating the computational model. The expected cross-entropy, an information-theoretic distance between the distributions of model prediction and experimental observation, is defined as a utility function to measure the similarity of the two distributions. A simulated annealing algorithm is used to find optimal values of the input variables by minimizing or maximizing the expected cross-entropy. The data measured after testing with the optimum input values are used to update the distribution of the experimental output using Bayes' theorem. The procedure is repeated to adaptively design the required number of experiments for model assessment, each time ensuring that the experiment provides an effective comparison for validation. The methodology is illustrated for the optimal design of validation experiments for a composite helicopter rotor hub component.

MOTIVATION

Model validation, the process of comparing model predictions with experimental observations, can be either qualitative or quantitative. Ongoing research at Vanderbilt is focused on quantitative methods to determine the validity or predictive capability of a mathematical model. Two key challenges in the validation process are the definition of validation metrics and the design of validation experiments [2]. A validation metric is the measure of agreement between model predictions and experimental observations.
In recent years, several studies have investigated the fundamental concepts and methodologies for validation of large-scale computational models, such as by the United States Department of Defense [3], the American Institute of Aeronautics and Astronautics [4], the Accelerated Strategic Computing Initiative (ASCI) program of the United States Department of Energy [5], and the American Society of Mechanical Engineers Standards Committee [6] on verification and validation in computational solid mechanics. More recently, the Vanderbilt team has developed a Bayesian hypothesis-testing-based model validation methodology [7−13]. The differences between classical and Bayesian hypothesis testing have also been discussed in detail by many researchers (e.g., [14, 15]). Another model validation approach is the use of decision-theoretic utility or loss functions. Jiang and Mahadevan [16] developed a Bayesian risk-based decision-making methodology for computational model validation, considering the risk of using the current model, data support for the current model, and the cost of acquiring new information to improve the model. In this paper, we focus on the design of validation experiments targeted toward the specific computational model used, and consider the uncertainty in both model prediction and experimental observation. Experimental design is a decision-theoretic problem in which a utility function must first be defined to specify the desired benefit of the experimental outcome and to evaluate different designs, and then optimized with respect to the design treatments.

1 This paper is a shorter version of the main paper [1] published in the journal Measurement Science and Technology.
In designing an experiment, decisions must be made before data collection, such as choosing which treatments to study, defining the treatments precisely, choosing how to randomize the treatments, specifying a length of time for a time-dependent experiment, specifying the number of replicated experiments, etc. Different from conventional experiments, which are suitable for phenomena discovery (i.e., a basic understanding of physical processes), a validation experiment is designed for a clearly defined purpose, namely computational model assessment. The challenge of designing a validation experiment is therefore how to effectively target the experiment toward assessment of the specific model used. During the past several years, fundamental concepts and guidelines have been suggested for the design of validation experiments (e.g., [4, 17, 18, 19, 20]). However, no specific quantitative experimental design method has yet been developed for model validation. Nevertheless, it has been pointed out that (1) a validation experiment should be designed and executed to allow precise and conclusive comparisons of model prediction with experimental data for the purpose of assessing the model's predictive capabilities, and (2) optimal design of validation experiments needs to be done in order to create the greatest opportunities for performing these comparisons. The goal of this study is to find the optimum values of the input variables so that the data collected from the experiment provide the greatest opportunity for performing conclusive comparisons in model validation. The cross-entropy between the probability distributions of model prediction and experimental observation, an information-theoretic distance measure, is proposed to achieve this objective. The expected cross-entropy is first defined to measure the distance between the probability distribution of the model prediction and that of the experimental observation.
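The optimization at the heart of this design problem, tuning the input settings to extremize an expected cross-entropy, can be sketched with a generic simulated annealing routine. The following is a minimal illustration, not the authors' implementation: the quadratic objective is a hypothetical stand-in for the expected cross-entropy, and all names and tuning constants are assumptions.

```python
import math
import random

def simulated_annealing(objective, x0, step=0.5, t0=1.0, cooling=0.95,
                        iters=500, seed=0):
    """Minimize a scalar objective (here standing in for the expected
    cross-entropy between model-prediction and experiment-output
    distributions) over one design variable by simulated annealing."""
    rng = random.Random(seed)
    x, fx = x0, objective(x0)
    best_x, best_f = x, fx
    t = t0
    for _ in range(iters):
        cand = x + rng.uniform(-step, step)   # random neighbor move
        fc = objective(cand)
        # Accept downhill moves always; uphill moves with Boltzmann probability.
        if fc < fx or rng.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = x, fx
        t *= cooling  # geometric cooling schedule
    return best_x, best_f

# Hypothetical smooth objective with its minimum at x = 2.
x_opt, f_opt = simulated_annealing(lambda x: (x - 2.0) ** 2 + 1.0, x0=10.0)
```

In the paper's setting the objective would be evaluated by numerical simulation of the two distributions rather than by a closed-form expression, and the search would run over the distribution parameters of several input variables.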
Before testing, a prior distribution is assumed for the experimental output because no prior information is available. The value of the expected cross-entropy between the two distributions is calculated using numerical simulation. Optimal distributions of the random input variables are then obtained by minimizing/maximizing the expected cross-entropy using a simulated annealing algorithm. After one test is conducted with the mean values of the optimum distributions of the input random variables, the resulting experimental output is used to assess the computational model and to update the probability distribution of the experimental output using Bayes' theorem. The updated experimental output distribution is used in the optimum design of the next experiment. The procedure is repeated until the confidence measure of model prediction reaches a stable value. The proposed optimal design methodology is illustrated with a composite helicopter rotor hub component.

CROSS-ENTROPY FOR VALIDATION EXPERIMENT DESIGN

Shannon's entropy is generally defined as a function of a probability density function (PDF) p(y) and is written in the following expectation form

    S(Y) = −∫ p(y) log_b p(y) dy = −E_p[log_b p(y)]    (1)

where the quantity −log_b p(y) is interpreted as the information content of the outcome y ∈ Y, in which Y is any random physical variable in practical engineering problems. The base of the logarithm b is usually taken as 2, in which case the entropy is measured in "bits"; the bit, as a unit of information, refers to the information resulting from an occurrence of one of two equally probable alternatives. In practical applications, b = e is also used, in which case the entropy is measured in "nats". Shannon's entropy calculated by Eq. (1) is the average amount of information contained in the random variable Y [21]. It should be noted that the entropy of Y does not depend on the actual values of y, but only on its distribution p(y).
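Eq. (1) can be checked numerically with its discrete analogue, where the integral becomes a sum over outcome probabilities; the two-equally-probable-alternatives case recovers exactly one bit. This is a small illustrative sketch, not part of the paper's methodology:

```python
import math

def shannon_entropy(probs, base=2):
    """Discrete analogue of Eq. (1): S = -sum_i p_i * log_b(p_i).
    base=2 gives bits; base=math.e gives nats. Zero-probability
    outcomes contribute nothing to the sum."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

fair_coin = shannon_entropy([0.5, 0.5])   # two equally probable outcomes: 1 bit
biased    = shannon_entropy([0.9, 0.1])   # less uncertain, so less than 1 bit
```

Note that the function receives only the probabilities, never the outcome values themselves, mirroring the observation above that entropy depends only on the distribution p(y).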
In practical applications, the probability distribution of the experimental output usually depends strongly on the PDFs of the random input variables (treatments). Accordingly, the information represented by Shannon's entropy is gained through the choice of probability distributions of the input variables under given constraints. In general, the uncertainty in a given system is reduced by obtaining more information related to the system. It has been demonstrated by Kapur and Kesavan [22] that maximizing Shannon's entropy under given constraints yields the globally maximum value of the information. Recently, this feature of maximum Shannon's entropy has been applied to the optimal design of conventional experiments with the purpose of maximizing the information gain in experimental observations (e.g., [23−25]). The basic idea behind the maximum entropy method is to select the optimal distribution parameters of the input that maximize the expected entropy (i.e., information) of the experimental outcome. In the subsequent section, the cross-entropy concept is developed for the optimal design of a validation experiment by including the computational model, which cannot be realized with the Shannon entropy method. The cross-entropy, also called the Kullback-Leibler (KL) distance, was proposed by Kullback and Leibler [26] to measure the similarity between a true probability distribution and an estimated probability distribution. Based on this, an objective function can be defined in terms of the expected cross-entropy between the density of the model prediction and that of the experimental observation. The lower the expected cross-entropy, the closer the distribution of the experimental output is to that of the model prediction. Therefore, minimizing the expected cross-entropy leads to maximal similarity between the two distributions, and vice versa.
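The Kapur-Kesavan result cited above, specialized to the case of no constraints beyond normalization, says the uniform distribution over n outcomes attains the maximum entropy ln(n). A quick numerical check (an illustrative sketch, with arbitrary example distributions):

```python
import math

def entropy_nats(probs):
    # Shannon entropy in nats: -sum p * ln(p); p = 0 terms contribute 0.
    return -sum(p * math.log(p) for p in probs if p > 0)

n = 4
uniform = [1.0 / n] * n            # maximizer: entropy = ln(4) ~ 1.386 nats
skewed  = [0.7, 0.1, 0.1, 0.1]     # any non-uniform choice scores lower
```

Under additional moment constraints the maximizer changes (e.g., fixing the mean and variance on the real line yields a Gaussian), which is why constrained entropy maximization is a meaningful design criterion rather than a triviality.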
Let g(y) and h(y) denote the probability distributions of model prediction and experimental observation, respectively, with regard to a random physical variable y. The cross-entropy or KL distance in units of "nats" is defined as

    D(g, h) = ∫ g(y) ln[g(y)/h(y)] dy = ∫ g(y) ln[g(y)] dy − ∫ g(y) ln[h(y)] dy    (2)

where ln(z) denotes the natural logarithm of z. Based on the concept of Shannon's entropy described previously, Eq. (2) computes the difference of the expected information between the two distributions, i.e., D(g, h) = E_g[ln(g)] − E_g[ln(h)], in which the random variable y is omitted in the distributions g and h for clarity. It should be noted that D(g, h) is not a physical distance between g(y) and h(y) in the common sense. The proof is shown in Krantz [
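Because Eq. (2) is an expectation under g, it can be estimated by Monte Carlo sampling from the model-prediction density, which is how such integrals are typically evaluated when g and h are only available through simulation. The sketch below (an illustration under assumed Gaussian densities for model prediction and experimental observation, not the paper's code) checks the estimate against the known closed form for two univariate normals:

```python
import math
import random

def normal_pdf(y, mu, sigma):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def kl_monte_carlo(mu_g, sig_g, mu_h, sig_h, n=200_000, seed=1):
    """Monte Carlo estimate of Eq. (2): D(g, h) = E_g[ln g(y) - ln h(y)],
    sampling y from the model-prediction density g."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        y = rng.gauss(mu_g, sig_g)
        total += math.log(normal_pdf(y, mu_g, sig_g)) - math.log(normal_pdf(y, mu_h, sig_h))
    return total / n

def kl_gauss_exact(mu_g, sig_g, mu_h, sig_h):
    # Closed form for two univariate normals, used only to verify the estimate.
    return (math.log(sig_h / sig_g)
            + (sig_g ** 2 + (mu_g - mu_h) ** 2) / (2 * sig_h ** 2) - 0.5)

est   = kl_monte_carlo(0.0, 1.0, 0.5, 1.2)
exact = kl_gauss_exact(0.0, 1.0, 0.5, 1.2)
```

The estimate is nonnegative and vanishes only when g = h; swapping the roles of g and h generally gives a different value, which is the asymmetry behind the caveat that D(g, h) is not a distance in the ordinary sense.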
Similar Resources
An Efficient Bayesian Optimal Design for Logistic Model
Consider a Bayesian optimal design with many support points, which poses the problem of collecting data with few observations at each design point. Under such a scenario, the asymptotic property of using the Fisher information matrix for approximating the covariance matrix of posterior ML estimators might be doubtful. We suggest using the Bhattacharyya matrix in deriving the information matri...
Using design of experiments approach and simulated annealing algorithm for modeling and optimization of EDM process parameters
The main objectives of this research are, therefore, to assess the effects of process parameters and to determine their optimal levels for machining of Inconel 718 superalloy. Gap voltage, current, machining time, and duty factor are the tuning parameters considered as process input parameters. Furthermore, two important process output characteristics have been evaluated in this research...
Bayesian risk-based decision method for model validation under uncertainty
This paper develops a decision-making methodology for computational model validation, considering the risk of using the current model, data support for the current model, and the cost of acquiring new information to improve the model. A Bayesian decision-theory-based method is developed for this purpose, using a likelihood ratio as the validation metric for model assessment. An expected risk or cost...
A New Acceptance Sampling Design Using Bayesian Modeling and Backwards Induction
In acceptance sampling plans, the decision to accept or reject a specific batch is still a challenging problem. In order to provide a desired level of protection for customers as well as manufacturers, in this paper a new acceptance sampling design is proposed to accept or reject a batch based on Bayesian modeling to update the distribution function of the percentage of nonconfor...
Improve Estimation and Operation of Optimal Power Flow (OPF) Using Bayesian Neural Network
The future of power-system development and design is impossible without Power Flow (PF) studies, given system load growth and the necessity of adding generators, transformers, and power lines to the power system. Optimal Power Flow (OPF) studies are needed in addition to the items listed for PF in order to achieve the objective functions. In this paper, generator fuel cost, acti...
Publication year: 2006